智能论文笔记

Worldwide AI Ethics: a review of 200 guidelines and recommendations for AI governance

Nicholas Kluge Corrêa , Camila Galvão , James William Santos , Carolina Del Pino , Edson Pontes Pinto , Camila Barbosa , Diogo Massmann , Rodrigo Mambrini , Luiza Galvão , Edmund Terem

分类：人工智能

2022-06-23

在过去的十年中，许多组织制作了旨在从规范意义上进行标准化的文件，并为我们最近和快速的AI开发促进指导。但是，除了一些荟萃分析和该领域的批判性评论外，尚未分析这些文档中提出的思想的全部内容和分歧。在这项工作中，我们试图扩展过去研究人员所做的工作，并创建一种工具，以更好地数据可视化这些文档的内容和性质。我们还提供了通过将工具应用于200个文档的样本量获得的结果的批判性分析。

translated by 谷歌翻译

3DSGrasp: 3D Shape-Completion for Robotic Grasp

Seyed S. Mohammadi , Nuno F. Duarte , Dimitris Dimou , Yiming Wang , Matteo Taiana , Pietro Morerio , Atabak Dehban , Plinio Moreno , Alexandre Bernardino , Alessio Del Bue

分类：机器人 | 人工智能

2023-01-02

Real-world robotic grasping can be done robustly if a complete 3D Point Cloud Data (PCD) of an object is available. However, in practice, PCDs are often incomplete when objects are viewed from few and sparse viewpoints before the grasping action, leading to the generation of wrong or inaccurate grasp poses. We propose a novel grasping strategy, named 3DSGrasp, that predicts the missing geometry from the partial PCD to produce reliable grasp poses. Our proposed PCD completion network is a Transformer-based encoder-decoder network with an Offset-Attention layer. Our network is inherently invariant to the object pose and point's permutation, which generates PCDs that are geometrically consistent and completed properly. Experiments on a wide range of partial PCD show that 3DSGrasp outperforms the best state-of-the-art method on PCD completion tasks and largely improves the grasping success rate in real-world scenarios. The code and dataset will be made available upon acceptance.

translated by 谷歌翻译

MTNeuro: A Benchmark for Evaluating Representations of Brain Structure Across Multiple Levels of Abstraction

Jorge Quesada , Lakshmi Sathidevi , Ran Liu , Nauman Ahad , Joy M. Jackson , Mehdi Azabou , Jingyun Xiao , Christopher Liding , Matthew Jin , Carolina Urzay

分类：计算机视觉 | 机器学习

2023-01-01

There are multiple scales of abstraction from which we can describe the same image, depending on whether we are focusing on fine-grained details or a more global attribute of the image. In brain mapping, learning to automatically parse images to build representations of both small-scale features (e.g., the presence of cells or blood vessels) and global properties of an image (e.g., which brain region the image comes from) is a crucial and open challenge. However, most existing datasets and benchmarks for neuroanatomy consider only a single downstream task at a time. To bridge this gap, we introduce a new dataset, annotations, and multiple downstream tasks that provide diverse ways to readout information about brain structure and architecture from the same image. Our multi-task neuroimaging benchmark (MTNeuro) is built on volumetric, micrometer-resolution X-ray microtomography images spanning a large thalamocortical section of mouse brain, encompassing multiple cortical and subcortical regions. We generated a number of different prediction challenges and evaluated several supervised and self-supervised models for brain-region prediction and pixel-level semantic segmentation of microstructures. Our experiments not only highlight the rich heterogeneity of this dataset, but also provide insights into how self-supervised approaches can be used to learn representations that capture multiple attributes of a single image and perform well on a variety of downstream tasks. Datasets, code, and pre-trained baseline models are provided at: https://mtneuro.github.io/ .

translated by 谷歌翻译

Current State of Community-Driven Radiological AI Deployment in Medical Imaging

Vikash Gupta , Barbaros Selnur Erdal , Carolina Ramirez , Ralf Floca , Laurence Jackson , Brad Genereaux , Sidney Bryson , Christopher P Bridge , Jens Kleesiek , Felix Nensa

分类：人工智能

2022-12-29

Artificial Intelligence (AI) has become commonplace to solve routine everyday tasks. Because of the exponential growth in medical imaging data volume and complexity, the workload on radiologists is steadily increasing. We project that the gap between the number of imaging exams and the number of expert radiologist readers required to cover this increase will continue to expand, consequently introducing a demand for AI-based tools that improve the efficiency with which radiologists can comfortably interpret these exams. AI has been shown to improve efficiency in medical-image generation, processing, and interpretation, and a variety of such AI models have been developed across research labs worldwide. However, very few of these, if any, find their way into routine clinical use, a discrepancy that reflects the divide between AI research and successful AI translation. To address the barrier to clinical deployment, we have formed MONAI Consortium, an open-source community which is building standards for AI deployment in healthcare institutions, and developing tools and infrastructure to facilitate their implementation. This report represents several years of weekly discussions and hands-on problem solving experience by groups of industry experts and clinicians in the MONAI Consortium. We identify barriers between AI-model development in research labs and subsequent clinical deployment and propose solutions. Our report provides guidance on processes which take an imaging AI model from development to clinical implementation in a healthcare institution. We discuss various AI integration points in a clinical Radiology workflow. We also present a taxonomy of Radiology AI use-cases. Through this report, we intend to educate the stakeholders in healthcare and AI (AI researchers, radiologists, imaging informaticists, and regulators) about cross-disciplinary challenges and possible solutions.

translated by 谷歌翻译

True Detective: A Challenging Benchmark for Deep Abductive Reasoning \\in Foundation Models

Maksym Del , Mark Fishel

分类：自然语言处理

2022-12-20

Large language models (LLMs) have demonstrated strong performance in zero-shot reasoning tasks, including abductive reasoning. This is reflected in their ability to perform well on current benchmarks in this area. However, to truly test the limits of LLMs in abductive reasoning, a more challenging benchmark is needed. In this paper, we present such a benchmark, consisting of 191 long-form mystery stories, each approximately 1200 words in length and presented in the form of detective puzzles. Each puzzle includes a multiple-choice question for evaluation sourced from the "5 Minute Mystery" platform. Our results show that state-of-the-art GPT models perform significantly worse than human solvers on this benchmark, with an accuracy of 28\% compared to 47\% for humans. This indicates that there is still a significant gap in the abductive reasoning abilities of LLMs and highlights the need for further research in this area. Our work provides a challenging benchmark for future studies on reasoning in language models and contributes to a better understanding of the limits of LLMs' abilities.

translated by 谷歌翻译

UnitY: Two-pass Direct Speech-to-speech Translation with Discrete Units

Hirofumi Inaguma , Sravya Popuri , Ilia Kulikov , Peng-Jen Chen , Changhan Wang , Yu-An Chung , Yun Tang , Ann Lee , Shinji Watanabe , Juan Pino

分类：自然语言处理

2022-12-15

Direct speech-to-speech translation (S2ST), in which all components can be optimized jointly, is advantageous over cascaded approaches to achieve fast inference with a simplified pipeline. We present a novel two-pass direct S2ST architecture, {\textit UnitY}, which first generates textual representations and predicts discrete acoustic units subsequently. We enhance the model performance by subword prediction in the first-pass decoder, advanced two-pass decoder architecture design and search strategy, and better training regularization. To leverage large amounts of unlabeled text data, we pre-train the first-pass text decoder based on the self-supervised denoising auto-encoding task. Experimental evaluations on benchmark datasets at various data scales demonstrate that UnitY outperforms a single-pass speech-to-unit translation model by 2.5-4.2 ASR-BLEU with 2.83x decoding speed-up. We show that the proposed methods boost the performance even when predicting spectrogram in the second pass. However, predicting discrete units achieves 2.51x decoding speed-up compared to that case.

translated by 谷歌翻译

RT-1: Robotics Transformer for Real-World Control at Scale

Anthony Brohan , Noah Brown , Justice Carbajal , Yevgen Chebotar , Joseph Dabis , Chelsea Finn , Keerthana Gopalakrishnan , Karol Hausman , Alex Herzog , Jasmine Hsu

分类：机器人 | 人工智能 | 自然语言处理 | 计算机视觉 | 机器学习

2022-12-13

By transferring knowledge from large, diverse, task-agnostic datasets, modern machine learning models can solve specific downstream tasks either zero-shot or with small task-specific datasets to a high level of performance. While this capability has been demonstrated in other fields such as computer vision, natural language processing or speech recognition, it remains to be shown in robotics, where the generalization capabilities of the models are particularly critical due to the difficulty of collecting real-world robotic data. We argue that one of the keys to the success of such general robotic models lies with open-ended task-agnostic training, combined with high-capacity architectures that can absorb all of the diverse, robotic data. In this paper, we present a model class, dubbed Robotics Transformer, that exhibits promising scalable model properties. We verify our conclusions in a study of different model classes and their ability to generalize as a function of the data size, model size, and data diversity based on a large-scale data collection on real robots performing real-world tasks. The project's website and videos can be found at robotics-transformer.github.io

translated by 谷歌翻译

Quantum Clustering with k-Means: a Hybrid Approach

Alessandro Poggiali , Alessandro Berti , Anna Bernasconi , Gianna Del Corso , Riccardo Guidotti

分类：机器学习

2022-12-13

Quantum computing is a promising paradigm based on quantum theory for performing fast computations. Quantum algorithms are expected to surpass their classical counterparts in terms of computational complexity for certain tasks, including machine learning. In this paper, we design, implement, and evaluate three hybrid quantum k-Means algorithms, exploiting different degree of parallelism. Indeed, each algorithm incrementally leverages quantum parallelism to reduce the complexity of the cluster assignment step up to a constant cost. In particular, we exploit quantum phenomena to speed up the computation of distances. The core idea is that the computation of distances between records and centroids can be executed simultaneously, thus saving time, especially for big datasets. We show that our hybrid quantum k-Means algorithms can be more efficient than the classical version, still obtaining comparable clustering results.

translated by 谷歌翻译

Towards Next Generation of Pedestrian and Connected Vehicle In-the-loop Research: A Digital Twin Simulation Framework

Zijin Wang , Ou Zheng , Liangding Li , Mohamed Abdel-Aty , Carolina Cruz-Neira , Zubayer Islam

分类：机器人

2022-12-08

Digital Twin is an emerging technology that replicates real-world entities into a digital space. It has attracted increasing attention in the transportation field and many researchers are exploring its future applications in the development of Intelligent Transportation System (ITS) technologies. Connected vehicles (CVs) and pedestrians are among the major traffic participants in ITS. However, the usage of Digital Twin in research involving both CV and pedestrian remains largely unexplored. In this study, a Digital Twin framework for CV and pedestrian in-the-loop simulation is proposed. The proposed framework consists of the physical world, the digital world, and data transmission in between. The features for the entities (CV and pedestrian) that need digital twined are divided into external state and internal state, and the attributes in each state are described. We also demonstrate a sample architecture under the proposed Digital Twin framework, which is based on Carla-Sumo Co-simulation and Cave automatic virtual environment (CAVE). The proposed framework is expected to provide guidance to the future Digital Twin research, and the architecture we build can serve as the testbed for further research and development of ITS applications on CV and pedestrian.

translated by 谷歌翻译

Design and Planning of Flexible Mobile Micro-Grids Using Deep Reinforcement Learning

Cesare Caputo , Michel-Alexandre Cardin , Pudong Ge , Fei Teng , Anna Korre , Ehecatl Antonio del Rio Chanona

分类：人工智能

2022-12-08

Ongoing risks from climate change have impacted the livelihood of global nomadic communities, and are likely to lead to increased migratory movements in coming years. As a result, mobility considerations are becoming increasingly important in energy systems planning, particularly to achieve energy access in developing countries. Advanced Plug and Play control strategies have been recently developed with such a decentralized framework in mind, more easily allowing for the interconnection of nomadic communities, both to each other and to the main grid. In light of the above, the design and planning strategy of a mobile multi-energy supply system for a nomadic community is investigated in this work. Motivated by the scale and dimensionality of the associated uncertainties, impacting all major design and decision variables over the 30-year planning horizon, Deep Reinforcement Learning (DRL) is implemented for the design and planning problem tackled. DRL based solutions are benchmarked against several rigid baseline design options to compare expected performance under uncertainty. The results on a case study for ger communities in Mongolia suggest that mobile nomadic energy systems can be both technically and economically feasible, particularly when considering flexibility, although the degree of spatial dispersion among households is an important limiting factor. Key economic, sustainability and resilience indicators such as Cost, Equivalent Emissions and Total Unmet Load are measured, suggesting potential improvements compared to available baselines of up to 25%, 67% and 76%, respectively. Finally, the decomposition of values of flexibility and plug and play operation is presented using a variation of real options theory, with important implications for both nomadic communities and policymakers focused on enabling their energy access.

translated by 谷歌翻译